Video telepresence technologies enable individuals to communicate using audio and video. Such technologies generally involve capturing video and audio of a first individual located at a first location, transmitting the video and audio over a network to a second individual located at a second location, and outputting the video and audio to the second individual. The first individual may also receive video and audio of the second individual. In this manner, the individuals may use cameras, display screens, microphones, and other equipment to facilitate a real-time conversation. However, video telepresence technologies often provide relatively little insight into the content being displayed.